YAM: A Step Forward for Generating a Dedicated Schema Matcher

Authors

Fabien Duchateau

Zohra Bellahsene

Abstract

Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semiautomatic: for example, an expert must tune certain parameters (thresholds, weights, etc.). They mainly use aggregation methods to combine similarity measures. The tuning of a matcher, especially of its aggregation function, has a strong impact on the quality of the resulting correspondences, and makes it difficult to integrate a new similarity measure or to match schemas from a specific domain. In this paper, we present YAM (Yet Another Matcher), a matcher factory which enables the generation of a dedicated schema matcher for a given schema matching scenario. For this purpose we have formulated the schema matching task as a classification problem. Based on this machine learning framework, YAM automatically selects and tunes the best method to combine similarity measures (e.g., a decision tree, an aggregation function). In addition, we describe how user inputs, such as a preference for recall or precision, can be closely integrated during the generation of the dedicated matcher. Many experiments run on matchers generated by YAM and on traditional matching tools confirm the benefits of a matcher factory and the significant impact of user preferences.
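The idea of treating schema matching as a classification problem can be illustrated with a small sketch. It is not YAM's implementation: the two similarity measures, the candidate combiners (a thresholded average and a one-level decision stump), the function names, and the training pairs are all assumptions made for illustration. Each pair of element names becomes a vector of similarity scores, each candidate combiner classifies the pair as a match or non-match, and the "factory" keeps the candidate with the best F-beta score on expert-labelled pairs. A beta above 1 expresses a preference for recall, a beta below 1 a preference for precision.

```python
from difflib import SequenceMatcher

def features(a, b):
    """Two toy similarity measures for a pair of element names (illustrative only)."""
    a, b = a.lower(), b.lower()
    char_sim = SequenceMatcher(None, a, b).ratio()      # character-level similarity
    ta, tb = set(a.split("_")), set(b.split("_"))
    token_sim = len(ta & tb) / len(ta | tb)             # Jaccard overlap of name tokens
    return (char_sim, token_sim)

def f_beta(predicted, actual, beta):
    """F-beta score; beta > 1 weights recall more, beta < 1 weights precision more."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

def average_classifier(threshold):
    """Aggregation-style combiner: mean of all similarity scores against a threshold."""
    return lambda x: sum(x) / len(x) >= threshold

def stump_classifier(index, threshold):
    """One-level decision tree: a single similarity measure against a threshold."""
    return lambda x: x[index] >= threshold

def generate_matcher(pairs, labels, beta=1.0):
    """Hypothetical 'factory': try every candidate, keep the best one on the training pairs."""
    X = [features(a, b) for a, b in pairs]
    candidates = {}
    for t in [i / 10 for i in range(1, 10)]:
        candidates[("average", t)] = average_classifier(t)
        for i in range(2):
            candidates[("stump", i, t)] = stump_classifier(i, t)
    best = max(candidates,
               key=lambda k: f_beta([candidates[k](x) for x in X], labels, beta))
    return best, candidates[best]

# Made-up expert correspondences: True means the two names denote the same element.
pairs = [("customer_name", "client_name"), ("customer_name", "order_date"),
         ("zip_code", "postal_code"), ("zip_code", "customer_name"),
         ("order_date", "date_of_order"), ("price", "order_date")]
labels = [True, False, True, False, True, False]

config, matcher = generate_matcher(pairs, labels, beta=2.0)  # beta=2: prefer recall
print(config, matcher(features("client_zip", "zip_code")))
```

A realistic matcher factory would draw on many more similarity measures and on proper learners evaluated on held-out data; the sketch only shows how selecting a combiner and honouring a recall or precision preference fit into one selection loop.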

Similar resources

Encore un outil de découverte de correspondances entre schémas XML ?

In this paper, we present YAM, a schema matcher factory. YAM (Yet Another Matcher) is not (yet) another schema matching system as it enables the generation of a la carte schema matchers according to user requirements. These requirements include a preference for recall or precision and a training data set (a set of expert correspondences or a domain of interest). YAM uses a knowledge base that i...

Yet Another Matcher

Discovering correspondences between schema elements is a crucial task for data integration. Most matching tools are semi-automatic, e.g. an expert must tune some parameters (thresholds, weights, etc.). They mainly use several methods to combine and aggregate similarity measures. However, their quality results often decrease when one requires to integrate a new similarity measure or when matchin...

Yet Another Matcher

An Improved Semantic Schema Matching Approach

Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Process (OLAP), Data mining, semantic web [2] and schema integration. This task is defined for finding the semantic correspondences between elements of two schemas. Recently, schema matching has found considerable interest in both research and practice. In this paper, we present a new impr...

YAM++ results for OAEI 2012

The YAM++ system is a self configuration, flexible and extensible ontology matching system. YAM++ takes advantages of many techniques coming from different fields such as machine learning, information retrieval, graph matching, etc. in order to enhance the matching quality. In this paper, we briefly present the YAM++ approach and its results on OAEI 2012 campaign. 1 Presentation of the system Y...

Journal:

Trans. Large-Scale Data- and Knowledge-Centered Systems

Volume 25

Publication date: 2016
